Capturing Long-distance Dependencies in Sequence Models: A Case Study of Chinese Part-of-speech Tagging

Authors

  • Weiwei Sun
  • Xiaochang Peng
  • Xiaojun Wan
Abstract

This paper is concerned with capturing long-distance dependencies in sequence models. We propose a two-step strategy. First, the stacked learning technique is applied to integrate sequence models, which are good at exploring local information, with other high-complexity models that are good at capturing long-distance dependencies. Second, the structure compilation technique is employed to transfer the predictive power of the hybrid models to sequence models via large-scale unlabeled data. To investigate the feasibility of our idea, we study Chinese POS tagging. Experiments on the Chinese Treebank data demonstrate the effectiveness of our methods. The re-compiled models not only achieve high per-token classification accuracy but also serve well as a front-end to a parser.
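The two-step strategy can be illustrated with a toy sketch. This is not the authors' implementation: the frequency-table "models", the `global_tags` rule standing in for a high-complexity model, the English example sentences, and the tag names are all assumptions made for illustration. Stacking in practice usually produces the level-0 predictions by cross-validation, which is omitted here for brevity.

```python
from collections import Counter, defaultdict

def train_lookup(pairs):
    """Fit a most-frequent-label table from (feature, label) pairs."""
    counts = defaultdict(Counter)
    for feat, label in pairs:
        counts[feat][label] += 1
    return {feat: c.most_common(1)[0][0] for feat, c in counts.items()}

def global_tags(sent):
    """Toy stand-in for a slow whole-sentence model (hypothetical rule):
    it only has an opinion about 'plan', decided by whether 'will'
    occurs anywhere in the sentence."""
    guess = "VV" if "will" in sent else "NN"
    return [guess if w == "plan" else "?" for w in sent]

def window(sent):
    """Local features: (previous word, current word)."""
    return list(zip(["<s>"] + sent[:-1], sent))

# Tiny illustrative data sets (invented for this sketch).
labeled = [
    (["will", "we", "plan"], ["MD", "PN", "VV"]),
    (["we", "like", "the", "plan"], ["PN", "VV", "DT", "NN"]),
    (["they", "will", "plan"], ["PN", "MD", "VV"]),
    (["we", "like", "this"], ["PN", "VV", "DT"]),
]
unlabeled = [
    ["they", "like", "this", "plan"],
    ["will", "they", "plan"],
]

# Level 0: a local tagger that sees only the current word.
word_model = train_lookup((w, t) for s, ts in labeled for w, t in zip(s, ts))

def local_tags(sent):
    return [word_model.get(w, "NN") for w in sent]

# Step 1 (stacking): a level-1 table over the word plus both level-0 guesses.
def stack_feats(sent):
    return list(zip(sent, local_tags(sent), global_tags(sent)))

stacker = train_lookup(
    (f, t) for s, ts in labeled for f, t in zip(stack_feats(s), ts))

def stacked_tags(sent):
    # Back off to the local guess when the level-1 feature was never seen.
    return [stacker.get(f, f[1]) for f in stack_feats(sent)]

# Step 2 (compilation): the stacked tagger labels raw text, and a fast
# local-window tagger is trained on gold plus automatically labeled data.
auto = [(s, stacked_tags(s)) for s in unlabeled]
compiled = train_lookup(
    (f, t) for s, ts in labeled + auto for f, t in zip(window(s), ts))

def compiled_tags(sent):
    # No call to global_tags here: only local features are used at test time.
    return [compiled.get(f, word_model.get(f[1], "NN")) for f in window(sent)]
```

On the sentence `["we", "like", "this", "plan"]`, the word-only tagger assigns "plan" its majority tag, while `compiled_tags` resolves it from the `("this", "plan")` window, a feature that occurs only in the automatically labeled data. The compiled tagger never consults the slow model at test time.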

Similar resources

Improving Chinese POS Tagging with Dependency Parsing

Recent research usually models POS tagging as a sequential labeling problem, in which only local context features can be used. Due to the lack of morphological inflections, many tagging ambiguities in Chinese are difficult to handle unless consulting larger contexts. In this paper, we try to improve Chinese POS tagging by using long-distance dependencies produced by a statistical dependency par...

Part-of-Speech Tagging of Persian Using a Fuzzy Network Model

Part-of-speech (POS) tagging is an ongoing research topic in natural language processing (NLP) applications. The process of classifying words into their parts of speech and labeling them accordingly is known as part-of-speech tagging, POS tagging, or simply tagging. Parts of speech are also known as word classes or lexical categories. The purpose of POS tagging is determining the grammatical ...

An improved joint model: POS tagging and dependency parsing

Dependency parsing is a form of syntactic parsing in natural language processing that automatically analyzes the dependency structure of sentences, producing a dependency graph for each input sentence. Part-of-speech (POS) tagging is a prerequisite for dependency parsing. Generally, dependency parsers perform POS tagging and dependency parsing in a pipeline. Unfortunately, in pipel...

Tree Memory Networks for Modelling Long-term Temporal Dependencies

In the domain of sequence modelling, Recurrent Neural Networks (RNNs) have achieved impressive results in a variety of application areas, including visual question answering, part-of-speech tagging and machine translation. However, this success in modelling short-term dependencies has not successfully transitioned to application areas such as trajectory prediction, which require c...

A Part-of-Speech Tagging System for the Persian Language

Abstract: Part-of-speech (POS) tagging is an essential step for many models and methods in other areas of natural language processing, such as machine translation, spell checking, text-to-speech, and automatic speech recognition. So far, highly accurate POS taggers have been created for many languages. In this paper, we focus on POS tagging in the Persian language. Because of problems in Persian POS t...

Publication date: 2013